On the Relationship between Phone Phonetic Speaker Recog
نویسندگان
چکیده
Speaker recognition techniques have traditionally relied on purely acoustic features and models. During the last few years, however, the field of speaker recognition has started to show interest in the use of higher level features. In particular, phonetic decodings modeled with statistical language models (n-grams) have already shown its effectiveness in several research works. However, the relationship between phonetic modeling precision and the accuracy of phonetic speaker recognition has not yet been sufficiently analyzed. As part of our preparation for the NIST 2005 speaker recognition evaluation, we have performed a number of experiments that show that there is a negligible correlation between phonetic modeling precision and phonetic speaker recognition accuracy. Furthermore, our experimental results show that phonetic speaker recognition results may even be better when using phonetic decodings in languages different from that of the speech.
منابع مشابه
Speaker verification based on broad phonetic categories
In this work we present a speaker verification system based on 4 broad phonetic categories: vowels+diphthongs, fricatives, glides+nasals, and silence+stops. Using these categories separately, it is observed that vowels, diphthongs, and fricatives are the most important categories for speaker verification. This observation confirms the results from the analysis of speaker and channel variability...
متن کاملPhonetic speaker recognition using maximum-likelihood binary-decision tree models
Recent work in phonetic speaker recognition has shown that modeling phone sequences using n-grams is a viable and effective approach to speaker recognition, primarily aiming at capturing speaker-dependent pronunciation and also word usage. This paper describes a method involving binary-tree-structured statistical models for extending the phonetic context beyond that of standard n-grams (particu...
متن کاملConstrained Subword Units for Speaker Recognition
Phonetic features have been proposed to overcome performance degradation in spectral speaker recognition in difficult acoustic conditions. The harmful effect of those conditions, however, is not restricted to spectral systems but also affects the performance of the open-loop phone recognisers on which phonetic systems are based. In automatic speech recognition, larger subword units and the use ...
متن کاملPhonetic vocoding with speaker adaptation
This paper describes a phonetic vocoding scheme which relies on speaker adaptation to capture important speaker characteristics. These are typically lost in phonetic vocoders which transmit only information about the phones which are recognized, together with some prosodic information. In our scheme, however, additional speaker characteristics are transmitted in vowel regions (average values of...
متن کاملUsing phone log-likelihood ratios as features for speaker recognition
The so called Phone Log-Likelihood Ratio (PLLR) features, computed on phone posterior probabilities provided by phonetic decoders, convey acoustic-phonetic information in a sequence of frame-level vectors. Thus, PLLRs can be easily plugged into traditional acoustic systems just by replacing MFCCs, PLPs or whatever other representation. PLLR features were used under an iVector-PLDA approach in o...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2005